
379 research outputs found

Efficient comparison based string matching

Author: Breslauer D. (Dany)
Galil Z.
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/1993

Finding 2-Edge and 2-Vertex Strongly Connected Components in Quadratic Time

Author: AL Buchsbaum
GF Italiano
H Nagamochi
H Nagamochi
HN Gabow
HN Gabow
J Bang-Jensen
JA Bondy
JE Hopcroft
K Chatterjee
L Georgiadis
M Henzinger
MR Henzinger
RE Tarjan
RE Tarjan
S Even
S Makino
Z Galil
Z Galil
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/05/2015

We present faster algorithms for computing the 2-edge and 2-vertex strongly connected components of a directed graph, which are straightforward generalizations of strongly connected components. While in undirected graphs the 2-edge and 2-vertex connected components can be found in linear time, in directed graphs only rather simple $O(m n)$-time algorithms were known. We use a hierarchical sparsification technique to obtain algorithms that run in time $O(n^2)$. For 2-edge strongly connected components our algorithm gives the first running time improvement in 20 years. Additionally we present an $O(m^2 / \log{n})$-time algorithm for 2-edge strongly connected components, and thus improve over the $O(m n)$ running time also when $m = O(n)$. Our approach extends to k-edge and k-vertex strongly connected components for any constant k with a running time of $O(n^2 \log^2 n)$ for edges and $O(n^3)$ for vertices.

arXiv.org e-Print Archive

Crossref

Detecting One-variable Patterns

Author: A Amir
A Ehrenfeucht
D Angluin
D Kosolobov
D Kosolobov
E Czeizler
F Manea
G Manacher
J Kärkkäinen
JEF Friedl
M Crochemore
M Crochemore
M Lothaire
M Rubinchik
ML Schmid
P Gawrychowski
Z Galil
Z Xu
Publication date: 01/01/2017

Given a pattern $p = s_1x_1s_2x_2\cdots s_{r-1}x_{r-1}s_r$ such that $x_1,x_2,\ldots,x_{r-1}\in\{x,\overset{{}_{\leftarrow}}{x}\}$, where $x$ is a variable and $\overset{{}_{\leftarrow}}{x}$ its reversal, and $s_1,s_2,\ldots,s_r$ are strings that contain no variables, we describe an algorithm that constructs in $O(rn)$ time a compact representation of all $P$ instances of $p$ in an input string of length $n$ over a polynomially bounded integer alphabet, so that one can report those instances in $O(P)$ time.

Comment: 16 pages (+13 pages of Appendix), 4 figures, accepted to SPIRE 201

arXiv.org e-Print Archive

Crossref

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

Searching of gapped repeats and subrepetitions in a word

Author: D. Gusfield
G. Brodal
J. Storer
M. Crochemore
M. Crochemore
M. Crochemore
M. Crochemore
P. Emde Boas van
R. Kolpakov
R. Kolpakov
R. Kolpakov
T. Kociumaka
Z. Galil
Publication date: 29/09/2013

A gapped repeat is a factor of the form $uvu$ where $u$ and $v$ are nonempty words. The period of the gapped repeat is defined as $|u|+|v|$. The gapped repeat is maximal if it cannot be extended to the left or to the right by at least one letter while preserving its period. The gapped repeat is called $\alpha$-gapped if its period is not greater than $\alpha |v|$. A $\delta$-subrepetition is a factor whose exponent is less than 2 but not less than $1+\delta$ (the exponent of the factor is the quotient of the length and the minimal period of the factor). The $\delta$-subrepetition is maximal if it cannot be extended to the left or to the right by at least one letter while preserving its minimal period. We reveal a close relation between maximal gapped repeats and maximal subrepetitions. Moreover, we show that in a word of length $n$ the number of maximal $\alpha$-gapped repeats is bounded by $O(\alpha^2n)$ and the number of maximal $\delta$-subrepetitions is bounded by $O(n/\delta^2)$. Using the obtained upper bounds, we propose algorithms for finding all maximal $\alpha$-gapped repeats and all maximal $\delta$-subrepetitions in a word of length $n$. The algorithm for finding all maximal $\alpha$-gapped repeats has $O(\alpha^2n)$ time complexity for the case of constant alphabet size and $O(n\log n + \alpha^2n)$ time complexity for the general case. For finding all maximal $\delta$-subrepetitions we propose two algorithms. The first algorithm has $O(\frac{n\log\log n}{\delta^2})$ time complexity for the case of constant alphabet size and $O(n\log n +\frac{n\log\log n}{\delta^2})$ time complexity for the general case. The second algorithm has $O(n\log n+\frac{n}{\delta^2}\log \frac{1}{\delta})$ expected time complexity.
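The definitions of exponent and subrepetition used in this abstract can be made concrete with a small sketch. It is not the algorithm from the paper: it only evaluates the definitions on a single given word, using the classical prefix function to find the minimal period. The function names (`minimal_period`, `exponent`, `is_delta_subrepetition`) are hypothetical and chosen for illustration only.

```python
from fractions import Fraction


def minimal_period(w: str) -> int:
    """Return the smallest p >= 1 such that w[i] == w[i + p] for all valid i."""
    if not w:
        raise ValueError("the empty word has no period")
    # Classical prefix function: pi[i] is the length of the longest proper
    # border (a prefix that is also a suffix) of w[:i + 1].
    pi = [0] * len(w)
    for i in range(1, len(w)):
        k = pi[i - 1]
        while k > 0 and w[i] != w[k]:
            k = pi[k - 1]
        if w[i] == w[k]:
            k += 1
        pi[i] = k
    # The minimal period equals the length minus the longest border.
    return len(w) - pi[-1]


def exponent(w: str) -> Fraction:
    """Quotient of the length and the minimal period, kept exact."""
    return Fraction(len(w), minimal_period(w))


def is_delta_subrepetition(w: str, delta) -> bool:
    """Check 1 + delta <= exponent(w) < 2 for the word w taken as a whole."""
    return 1 + delta <= exponent(w) < 2
```

For instance, the gapped repeat `abcab` (with u = `ab`, v = `c`) has minimal period 3 and exponent 5/3, so `is_delta_subrepetition("abcab", Fraction(1, 2))` holds, while `abab` has exponent exactly 2 and is rejected. Maximality, which depends on the surrounding word, is not checked here.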

arXiv.org e-Print Archive

Crossref

Pattern Matching in Multiple Streams

Author: A. Amir
D. Breslauer
F. Ergun
G.M. Landau
G.M. Landau
H. Karloff
K. Abrahamson
M. Ružić
R. Clifford
R. Clifford
R. Clifford
R. Clifford
R. Clifford
T.S. Jayram
Z. Galil
Publication date: 01/01/2012

We investigate the problem of deterministic pattern matching in multiple streams. In this model, one symbol arrives at a time and is associated with one of s streaming texts. The task at each time step is to report if there is a new match between a fixed pattern of length m and a newly updated stream. As is usual in the streaming context, the goal is to use as little space as possible while still reporting matches quickly. We give almost matching upper and lower space bounds for three distinct pattern matching problems. For exact matching we show that the problem can be solved in constant time per arriving symbol and O(m+s) words of space. For the k-mismatch and k-difference problems we give O(k) time solutions that require O(m+ks) words of space. In all three cases we also give space lower bounds which show our methods are optimal up to a single logarithmic factor. Finally we set out a number of open problems related to this new model for pattern matching.

Comment: 13 pages, 1 figure

arXiv.org e-Print Archive

Crossref

Warwick Research Archives Portal Repository

Faster Approximate String Matching for Short Patterns

Author: A. Andersson
A.H. Wright
D. Gusfield
D. Harel
D.E. Knuth
E. Ukkonen
E. Ukkonen
E.W. Myers
F.T. Leighton
G. Myers
G. Navarro
G.M. Landau
H. Hyyrö
K.E. Batcher
M. Farach-Colton
M.A. Bender
P. Bille
P. Sellers
Philip Bille
R. Baeza-Yates
R. Cole
R.A. Baeza-Yates
R.A. Wagner
S. Albers
S. Alstrup
S. Wu
S.C. Sahinalp
T. Hagerup
T.H. Cormen
V.L. Arlazarov
W. Masek
Z. Galil
Z. Galil
Publication date: 17/03/2011

We study the classical approximate string matching problem, that is, given strings $P$ and $Q$ and an error threshold $k$, find all ending positions of substrings of $Q$ whose edit distance to $P$ is at most $k$. Let $P$ and $Q$ have lengths $m$ and $n$, respectively. On a standard unit-cost word RAM with word size $w \geq \log n$ we present an algorithm using time $O(nk \cdot \min(\frac{\log^2 m}{\log n},\frac{\log^2 m\log w}{w}) + n)$. When $P$ is short, namely, $m = 2^{o(\sqrt{\log n})}$ or $m = 2^{o(\sqrt{w/\log w})}$, this improves the previously best known time bounds for the problem. The result is achieved using a novel implementation of the Landau-Vishkin algorithm based on tabulation and word-level parallelism.

Comment: To appear in Theory of Computing Systems
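To make the problem statement concrete, here is a minimal sketch of the textbook dynamic-programming baseline for approximate string matching, commonly attributed to Sellers. It runs in O(nm) time and is not the tabulated Landau-Vishkin implementation that the abstract describes. The function name `approx_match_ends` and the convention of reporting 1-based ending positions are assumptions made for this illustration.

```python
def approx_match_ends(P: str, Q: str, k: int) -> list[int]:
    """Return 1-based ending positions j such that some substring of Q
    ending at position j has edit distance at most k to P."""
    m = len(P)
    # col[i] = minimum edit distance between P[:i] and a substring of Q
    # ending at the current text position. Before reading any text, only
    # the empty substring is available, so col[i] = i.
    col = list(range(m + 1))
    ends = []
    for j, c in enumerate(Q, start=1):
        new = [0] * (m + 1)  # new[0] = 0: a match may start anywhere in Q
        for i in range(1, m + 1):
            new[i] = min(
                col[i - 1] + (P[i - 1] != c),  # match or substitution
                col[i] + 1,                    # c is an extra text symbol
                new[i - 1] + 1,                # P[i - 1] is missing in the text
            )
        col = new
        if col[m] <= k:
            ends.append(j)
    return ends
```

With `P = "abc"` and `Q = "xabcx"`, the call with `k = 0` reports only position 4 (the exact occurrence), while `k = 1` reports positions 3, 4 and 5, corresponding to the substrings `ab`, `abc` and `abcx`.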

arXiv.org e-Print Archive

Crossref

Online Research Database In Technology

An Optimal Algorithm for Tiling the Plane with a Translated Polyomino

Author: A Blondin-Massé
C Goodman-Strauss
D Beauquier
D Gusfield
D Schattschneider
D Schattschneider
DE Knuth
HAG Wijshoff
HD Shapiro
L Gambini
N Ollinger
S Brlek
SW Golomb
SW Golomb
Z Galil
Publication date: 01/01/2015

We give an $O(n)$-time algorithm for determining whether translations of a polyomino with $n$ edges can tile the plane. The algorithm is also an $O(n)$-time algorithm for enumerating all such tilings that are also regular, and we prove that at most $\Theta(n)$ such tilings exist.

Comment: In proceedings of ISAAC 201

arXiv.org e-Print Archive

Crossref

DI-fusion

Online Detection of Repetitions with Backtracking

Author: A Apostolico
D Breslauer
D Breslauer
H Leung
J Jansson
JJ Hong
M Crochemore
MG Main
Z Galil
Publication date: 01/01/2015

In this paper we present two algorithms for the following problem: given a string and a rational $e > 1$, detect in an online fashion the earliest occurrence of a repetition of exponent $\ge e$ in the string.

1. The first algorithm supports the backtrack operation removing the last letter of the input string. This solution runs in $O(n\log m)$ time and $O(m)$ space, where $m$ is the maximal length of a string generated during the execution of a given sequence of $n$ read and backtrack operations.
2. The second algorithm works in $O(n\log\sigma)$ time and $O(n)$ space, where $n$ is the length of the input string and $\sigma$ is the number of distinct letters. This algorithm is relatively simple and requires much less memory than the previously known solution with the same working time and space.

Comment: 12 pages, 5 figures, accepted to CPM 201

arXiv.org e-Print Archive

Crossref

Institutional repository of Ural Federal University named after the first President of Russia B.N.Yeltsin

Ternary Syndrome Decoding with Large Weight

Author: A Becker
A Becker
A May
AD Flaxman
C Peters
CT Gueye
D Wagner
E Berlekamp
E Prange
J Stern
JT Coffey
M Chaimovich
N Howgrave-Graham
N Sendrier
NT Courtois
S Hirose
T Johansson
Z Galil
Publication date: 14/06/2019

The Syndrome Decoding problem is at the core of many code-based cryptosystems. In this paper, we study ternary Syndrome Decoding in large weight. This problem has been introduced in the Wave signature scheme but has never been thoroughly studied. We perform an algorithmic study of this problem which results in an update of the Wave parameters. On a more fundamental level, we show that ternary Syndrome Decoding with large weight is a really harder problem than the binary Syndrome Decoding problem, which could have several applications for the design of code-based cryptosystems

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Cryptology ePrint Archive

Palindromic Decompositions with Gaps and Errors

Author: A Apostolico
A Frid
D Breslauer
D Gusfield
D Kosolobov
DE Knuth
G Fici
G Manacher
M Crochemore
M Crochemore
M Rubinchik
R Kolpakov
S Gupta
T I
X Droubay
X Droubay
Y Fujishige
Z Galil
Publication date: 27/03/2017

Identifying palindromes in sequences has been an interesting line of research in combinatorics on words and also in computational biology, after the discovery of the relation of palindromes in the DNA sequence with the HIV virus. Efficient algorithms for the factorization of sequences into palindromes and maximal palindromes have been devised in recent years. We extend these studies by allowing gaps in decompositions and errors in palindromes, and also imposing a lower bound to the length of acceptable palindromes. We first present an algorithm for obtaining a palindromic decomposition of a string of length $n$ with the minimal total gap length in time $O(n \log n \cdot g)$ and space $O(n g)$, where $g$ is the number of allowed gaps in the decomposition. We then consider a decomposition of the string in maximal $\delta$-palindromes (i.e. palindromes with $\delta$ errors under the edit or Hamming distance) and $g$ allowed gaps. We present an algorithm to obtain such a decomposition with the minimal total gap length in time $O(n (g + \delta))$ and space $O(n g)$.

Comment: accepted to CSR 201

arXiv.org e-Print Archive

Crossref